1. What you will do in this R Notebook

1.1 Introduction

This is the second day of your training in R for Machine Learning. You will now enter the wide, wild and exciting field of Deep Learning! Neural Networks and Deep Learning have revolutionized the performance of many Machine Learning tasks in image analysis, video analysis, natural language processing and many other “traditional” classification, regression or clustering tasks.

As in the first lab, you will be dealing with a huge number of variables (gene expression values per patient) compared to the number of patients. Again, you would like to provide the clinician with a simple output that they can use for their patients.

1.2 Main steps

In this notebook, you will be guided through TCGA expression data and images from the MNIST data set. From the expression data we will implement a neural network to perform a classification task, and from the images we will implement a (convolutional) neural network, also for classification.

Neural networks and machine learning are at the heart of current innovations! To go further, join us for exciting research projects!

1.3 In practice

The code is provided to you. You just have to follow the instructions throughout the notebook. Explanations and corrections will be given for a subset of the questions.

IMPORTANT There are questions in this notebook that you need to answer and code that you need to write by yourself, either during the lab or at home. The ones you have to do during the lab are indicated by the tag “INCLASS WORK”, and the ones you have to do after the class are indicated by “HOME WORK”.

2. Build your first neural network with Keras.

2.1 About Keras.

Keras is a Python deep learning library designed to help humans build and train neural networks of many different types. It is built on top of TensorFlow 2, an end-to-end open source platform for machine learning. Keras has an R interface through the keras R package, which we will be using in this lab.

The math is implemented for you, and the functions are relatively user-friendly: Keras is a high-level API. It can use several lower-level back-ends for the math and computation. Keras and TensorFlow have been developed by Google and are open source.

In order to install Keras through the R package, you will need Anaconda and a Python environment with TensorFlow 2 installed. If Anaconda (or Miniconda) is not installed on your computer, download and install it. Then, run the following chunk, which will create a conda environment named r_tensorflow with the libraries required to run Keras.

library(keras)
# keras::install_keras(method="conda", envname="r_tensorflow")
use_condaenv(condaenv="r_tensorflow")

You can check that an environment named r_tensorflow has been created by running the command conda info --envs from a command line interface. After running this cell, restart RStudio. In your new session, comment out the line keras::install_keras and uncomment the line use_condaenv. Check that your installation is correct by verifying that the next chunk does not return an error.

model <- keras_model_sequential()

For the next steps we will be using some R libraries. Load them with the following:

library(RColorBrewer)
library(knitr)

We will also define a few helper functions.

# Map each unique value of `vec` to a semi-transparent color from an RColorBrewer palette
getColors <- function(vec, pal="Dark2", alpha=0.7){
  colors <- list()
  palette <- brewer.pal(max(length(unique(vec)),3), pal)  # brewer.pal requires at least 3 colors
  i <- 1
  for (v in unique(vec)){
    colors[[as.character(v)]] <- adjustcolor(palette[i], alpha)
    i <- i+1
  }
  colors
}
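
For example, a toy call (the values here are made up) returns a named list with one color per unique value:

colorsToy <- getColors(c("luad", "lusc", "luad"))
names(colorsToy)
## [1] "luad" "lusc"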

# Build a confusion matrix as a data.frame
# (rows: predicted labels, columns: correct labels)
getConfusionMatrix <- function(labelsPredicted, labelsCorrect, labels=NULL){
  if (is.null(labels)){
    labels <- sort(unique(union(labelsPredicted, labelsCorrect)))
  } else {
    labels <- sort(unique(labels))
  }
  confMat <- data.frame(row.names=labels)

  for (labelPredicted in labels){
    for (labelCorrect in labels){
      confMat[labelPredicted, labelCorrect] <- sum(labelsPredicted==labelPredicted & labelsCorrect==labelCorrect)
    }
  }

  confMat
}
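
As a quick check of getConfusionMatrix, a toy call (the inputs here are made up):

getConfusionMatrix(c("a", "a", "b"), c("a", "b", "b"))
##   a b
## a 1 1
## b 0 1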

2.2 Load the data

As in the previous lab, we will be using data from the TCGA project. More specifically we will continue to load expression data restricted to genes from the Cancer Gene Census.

source("../../lib/LoadcBioportal.R")
CgcTable <- read.csv("../data/CancerGeneCensusCOSMIC_20210120.csv")
CgcGenes <- CgcTable$Gene.Symbol

TcgaData <- LoadcBioportal(Organ="luad|lusc",
                           Genes=CgcGenes,
                           ClinicNeeded=T,
                           MutNeeded=F,
                           RNANeeded=T,
                           NormalizeRNA=T,
                           FunctionalAnnot=F)
## Warning: package 'cgdsr' was built under R version 4.0.3
## Please send questions to cbioportal@googlegroups.com
## [1] "Pooling : luad_tcga_pub & lusc_tcga"
## [1] "Discard C15orf65, IGH, IGK, IGL, NUTM2B, TEC, TRA, TRB, TRD because NAs in RNAseq"
print(table(TcgaData$CLINIC$study))
## 
## luad lusc 
##  230  178
colorsStudy <- getColors(TcgaData$CLINIC$study, alpha=0.7)

TcgaData$CLINIC$study_num <- ifelse(TcgaData$CLINIC$study=="luad",0, 1)

2.3 Visualize your data

As already explained in the previous lab, visualizing your data and performing some vanilla analyses is an important step. Let us reproduce the same plots as for the luad|gbm comparison.

The distribution plot

hist(x=as.matrix(TcgaData$EXP[TcgaData$CLINIC$study=="luad",]), col=colorsStudy[["luad"]],
     main="Distribution of gene expressions", xlab="Normalized counts", freq = F)
hist(as.matrix(TcgaData$EXP[TcgaData$CLINIC$study=="lusc",]), add=T, col=colorsStudy[["lusc"]], freq = F)
legend("topright",legend=c("luad","lusc"), fill=2:3, cex=0.8)

The heatmap plot

rowNames <- rownames(TcgaData$EXP)
rowStudy <- TcgaData$CLINIC[rowNames, "study"]
rowSideColors <- sapply(rowStudy, function(x) colorsStudy[[x]])
heatmap(as.matrix(TcgaData$EXP), Rowv=NULL, Colv=NULL, keep.dendro=F, RowSideColors=rowSideColors)
TCGA expression data LUAD and LUSC

The PCA plot

# center each gene (column) before computing the SVD
X <- t(t(TcgaData$EXP) - rowMeans(t(TcgaData$EXP)))
resSVD <- svd(X, nu=2, nv=2)
# the scores are the principal components i.e the 
# low-dimensional representations of the samples
scores <- resSVD$u %*% diag(resSVD$d[1:2])
# plot the two first principal components
plot(scores[,1],scores[,2],
     col=unlist(colorsStudy[TcgaData$CLINIC$study]), 
     pch=16,main='PCA representation',
     xlab=paste("dim1 = ",100*round(resSVD$d[1]^2/sum(resSVD$d^2),3),"%",sep = ""),
     ylab=paste("dim2 = ",100*round(resSVD$d[2]^2/sum(resSVD$d^2),3),"%",sep = ""))
legend("bottomright",legend=c("luad","lusc"), pch=16, cex=0.8, col=2:3)
PCA TCGA expression data LUAD and LUSC

2.4 Build a neural network

Train/test split

In order to correctly assess the generalization capabilities of our future model, we split the data into a train part and a test part.

studySize <- nrow(TcgaData$CLINIC)
trainProp <- 0.8
trainIndex <- sample(seq(studySize), size=round(studySize*trainProp))
testIndex <- seq(studySize)[!seq(studySize) %in% trainIndex]

print(paste("Training size:", length(trainIndex)))
## [1] "Training size: 326"
print(paste("Test size:", length(testIndex)))
## [1] "Test size: 82"

Model building

Building a neural network means building individual layers and connecting them. Once you have built your network, you can compile it and then fit it on the data.

The basic building block of a neural network is the layer. Layers sequentially learn representations of the data fed into them and output their new representation to the following layer, until the last layer provides a representation suitable for performing a given task.

INCLASS WORK Define a neural network by defining a sequence of dense layers. For this first exercise, you have to build a network with only one hidden layer. The input layer should take as input a vector whose size is the number of genes in the TcgaData$EXP table, while the output layer should produce “class-conditional probabilities” (with two classes, a single sigmoid unit giving the probability of one class is enough; note that neural network outputs are not, in general, probabilities).

Hint: In order to build your first model you will need the following functions: 1. keras_model_sequential to initialize the model object; 2. layer_dense to create dense (i.e. fully connected) layers.

You may find information in the online documentation of the R keras package.

#### YOUR WORK HERE
input_shape <- ncol(TcgaData$EXP)
model <- keras_model_sequential()
model %>%
  layer_dense(input_shape = input_shape, units = 128, activation = 'relu') %>%
  layer_dense(units = 30, activation = 'relu') %>%
  layer_dense(units = 1, activation = 'sigmoid')
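
Note that the solution above actually uses two hidden layers. A sketch of the single-hidden-layer network that the prompt literally asks for (the choice of 64 units is arbitrary) would be:

# Sketch only: one hidden layer of arbitrary width
sketch <- keras_model_sequential()
sketch %>%
  layer_dense(input_shape = ncol(TcgaData$EXP), units = 64, activation = 'relu') %>%  # the only hidden layer
  layer_dense(units = 1, activation = 'sigmoid')                                      # class probability output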

You can visualize your model architecture by simply printing the model, as in the following chunk:

print(model)
## Model
## Model: "sequential_1"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## dense_2 (Dense)                     (None, 128)                     91520       
## ________________________________________________________________________________
## dense_1 (Dense)                     (None, 30)                      3870        
## ________________________________________________________________________________
## dense (Dense)                       (None, 1)                       31          
## ================================================================================
## Total params: 95,421
## Trainable params: 95,421
## Non-trainable params: 0
## ________________________________________________________________________________
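
As a sanity check on the parameter counts above: the first layer has 714×128 weights plus 128 biases, i.e. 91,520 parameters, which tells us the expression table contains 714 genes; the two following layers contribute 128×30 + 30 = 3,870 and 30×1 + 1 = 31 parameters.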

Compile the model

Before the model is ready for training, it needs a few more settings. These are added during the model’s compilation step. In particular, you need to specify the following:

  1. The loss function - This measures how accurate the model is during training. We want to minimize this function to “steer” the model in the right direction. Careful design of the loss function makes it possible to train networks with specific properties (regularized networks, for instance).

  2. Optimizer - This is the numerical method used to update the network’s weights so as to minimize the loss function.

  3. Metrics - Used to monitor the training and testing steps. The following example uses accuracy, the fraction of the samples that are correctly classified.

INCLASS WORK Apply the compilation step on your model by specifying the parameters mentioned above. You should use adam as an optimizer, binary_crossentropy as a loss function and accuracy as a metric.

Hint: Look into this example and into the documentation of the compile function.

#### YOUR WORK HERE
model %>% compile(
  optimizer = 'adam', 
  loss = 'binary_crossentropy',
  metrics = c('accuracy')
)

Train your network

In order to train your neural network, you will have to provide it with the training data (predictive variables and predicted variable) as well as extra parameters specifying the validation_split used to assess the model performance during training, the number of epochs, or the batch size.

  1. What are epochs? What is the batch size? What could be the advantage of reducing or increasing the batch size?

ANSWER: An epoch is one complete pass of the training algorithm over the whole training set. The batch size is the number of training samples processed before each update of the network’s weights. Smaller batches give noisier but more frequent updates and use less memory; larger batches give more stable gradient estimates and better hardware throughput, at the cost of fewer updates per epoch.
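
For reference, the batch size is passed to fit through the batch_size argument (a sketch; the value below is only illustrative, and 32 is the usual Keras default):

model %>% fit(
  x = as.matrix(TcgaData$EXP[trainIndex,]),
  y = TcgaData$CLINIC$study_num[trainIndex],
  epochs = 50,
  batch_size = 32,        # number of samples per gradient update
  validation_split = 0.2)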

INCLASS WORK Use the fit method to train your neural network on the training data. Train it for 50 epochs using a validation split of 20%.

#### YOUR CODE HERE
model %>% fit(
  x = as.matrix(TcgaData$EXP[trainIndex,]),
  y = TcgaData$CLINIC$study_num[trainIndex], 
  epochs = 50, 
  validation_split = 0.2, 
  shuffle = FALSE)

As the model trains, the loss and accuracy metrics are displayed. This model reaches an accuracy of about XX on the training data.
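
Note that fit returns a training-history object (invisibly); if you capture it, you can inspect the recorded metrics directly. A sketch (calling fit again will continue training the same model):

history <- model %>% fit(
  x = as.matrix(TcgaData$EXP[trainIndex,]),
  y = TcgaData$CLINIC$study_num[trainIndex],
  epochs = 50,
  validation_split = 0.2,
  shuffle = FALSE)
tail(history$metrics$accuracy, 1)      # training accuracy at the final epoch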

Evaluate accuracy

Next, compare how the model performs on the test dataset:

INCLASS WORK Use the evaluate method to evaluate the quality of your neural network on the test data. How does it compare to the train accuracy? To the validation accuracy? Use the predict_classes method to compute the confusion matrices on the train and test sets.

#### YOUR WORK HERE
scoreTrain <- model %>% evaluate(as.matrix(TcgaData$EXP[trainIndex,]), TcgaData$CLINIC$study_num[trainIndex])
scoreTest <- model %>% evaluate(as.matrix(TcgaData$EXP[testIndex,]), TcgaData$CLINIC$study_num[testIndex])
studyPredictedTrain <- predict_classes(model, x=as.matrix(TcgaData$EXP[trainIndex,]))
studyPredictedTest <- predict_classes(model, x=as.matrix(TcgaData$EXP[testIndex,])) 

studyPredictedTrain <- ifelse(studyPredictedTrain==0, "luad", "lusc")
studyPredictedTest <- ifelse(studyPredictedTest==0, "luad", "lusc")

confMatTrain <- getConfusionMatrix(studyPredictedTrain, TcgaData$CLINIC$study[trainIndex])
confMatTest <- getConfusionMatrix(studyPredictedTest, TcgaData$CLINIC$study[testIndex])

ktrain <- kable(confMatTrain, caption="Train")
ktest <- kable(confMatTest, caption="Test")
kables(list(ktrain, ktest), format="html", caption="Confusion matrices")
Confusion matrices (rows: predicted label, columns: correct label)

Train
       luad  lusc
luad    181     1
lusc      1   143

Test
       luad  lusc
luad     47     3
lusc      1    31

INCLASS WORK Experiment with other architectures by adding more layers, changing the layer sizes and/or adding regularization methods such as dropout or batch normalization (see the sketch after the solution below). Read about these online or in the documentation.

#### YOUR WORK HERE
input_shape <- ncol(TcgaData$EXP)
model <- keras_model_sequential()
model %>%
  layer_dense(input_shape = input_shape, units = 256, activation = 'relu') %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dropout(rate=0.2) %>%
  layer_dense(units = 64, activation = 'relu') %>%
  layer_dropout(rate=0.2) %>%
  layer_dense(units = 16, activation = 'relu') %>%
  layer_dropout(rate=0.1) %>%
  layer_dense(units = 1, activation = 'sigmoid')
print(model)
## Model
## Model: "sequential_2"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## dense_7 (Dense)                     (None, 256)                     183040      
## ________________________________________________________________________________
## dense_6 (Dense)                     (None, 128)                     32896       
## ________________________________________________________________________________
## dropout_2 (Dropout)                 (None, 128)                     0           
## ________________________________________________________________________________
## dense_5 (Dense)                     (None, 64)                      8256        
## ________________________________________________________________________________
## dropout_1 (Dropout)                 (None, 64)                      0           
## ________________________________________________________________________________
## dense_4 (Dense)                     (None, 16)                      1040        
## ________________________________________________________________________________
## dropout (Dropout)                   (None, 16)                      0           
## ________________________________________________________________________________
## dense_3 (Dense)                     (None, 1)                       17          
## ================================================================================
## Total params: 225,249
## Trainable params: 225,249
## Non-trainable params: 0
## ________________________________________________________________________________
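
The solution above uses dropout; batch normalization, also mentioned in the exercise, would be added with layer_batch_normalization. A sketch of one hidden block using it (the layer sizes are arbitrary):

# Sketch only: batch normalization between a dense layer and its activation
modelBn <- keras_model_sequential()
modelBn %>%
  layer_dense(input_shape = ncol(TcgaData$EXP), units = 128) %>%
  layer_batch_normalization() %>%      # normalize activations over each batch
  layer_activation('relu') %>%
  layer_dense(units = 1, activation = 'sigmoid')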

Let’s now compile and fit the dropout architecture!

model %>% compile(
  optimizer = 'adam', 
  loss = 'binary_crossentropy',
  metrics = c('accuracy')
)

history <- model %>% fit(
  x = as.matrix(TcgaData$EXP[trainIndex,]),
  y = TcgaData$CLINIC$study_num[trainIndex], 
  epochs = 50, 
  validation_split = 0.2, 
  shuffle = FALSE)
plot(history)
## `geom_smooth()` using formula 'y ~ x'

studyPredictedTrain <- predict_classes(model, x=as.matrix(TcgaData$EXP[trainIndex,]))
studyPredictedTest <- predict_classes(model, x=as.matrix(TcgaData$EXP[testIndex,])) 

studyPredictedTrain <- ifelse(studyPredictedTrain==0, "luad", "lusc")
studyPredictedTest <- ifelse(studyPredictedTest==0, "luad", "lusc")

confMatTrain <- getConfusionMatrix(studyPredictedTrain, TcgaData$CLINIC$study[trainIndex])
confMatTest <- getConfusionMatrix(studyPredictedTest, TcgaData$CLINIC$study[testIndex])

ktrain <- kable(confMatTrain, caption="Train")
ktest <- kable(confMatTest, caption="Test")
kables(list(ktrain, ktest), format="html", caption="Confusion matrices")
Confusion matrices

Train
       luad  lusc
luad    181     1
lusc      1   143

Test
       luad  lusc
luad     47     2
lusc      1    32

3. Build your first CNN

Convolutional neural networks are a specific type of neural network that use a reduced number of parameters compared to conventional feed-forward neural networks by exploiting the local structure that exists in the input data. These networks perform particularly well on images, which are highly structured data.

In this exercise you will train a convolutional neural network on the now famous MNIST data set.

3.1 Load the MNIST data

mnist   <- keras::dataset_mnist()
x_train <- mnist$train$x          # 3d array (60000,28,28)
y_train <- mnist$train$y          # 1d array (60000)
x_test  <- mnist$test$x           # 3d array (10000,28,28)
y_test  <- mnist$test$y           # 1d array (10000)

colorsDigit <- getColors(y_train, pal="Paired")

3.2 Visualize some examples

As usual, let’s have a look at the data

par(mfcol=c(6,6))
par(mar=c(0, 0, 0, 0), xaxs='i', yaxs='i')
for (idx in 1:36) { 
    im <- x_train[idx,,]
    im <- t(apply(im, 2, rev)) 
    image(1:28, 1:28, 1-im, ann=F, xaxt="n", yaxt="n", col=gray((0:255)/255), main=NULL)
}
36 Images from MNIST

For the needs of the model and the next step, the following chunk reshapes the data into matrices by flattening the 28x28 images into 784-long vectors and one-hot encoding the observation labels.

x_train <- keras::array_reshape(x_train, c(nrow(x_train), 784))
x_test  <- keras::array_reshape(x_test, c(nrow(x_test), 784))
x_train <- x_train/255
x_test  <- x_test/255
y_train_cat <- keras::to_categorical(y_train, 10)
y_test_cat  <- keras::to_categorical(y_test, 10)
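
To see what the one-hot encoding does, here is a toy call (output shown as comments):

keras::to_categorical(c(0, 2), 3)
##      [,1] [,2] [,3]
## [1,]    1    0    0
## [2,]    0    0    1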

To begin with, let’s go for a naive strategy and run a PCA on the training samples to see how they cluster.

INCLASS WORK Reuse the code you have already seen in order to run a PCA on the MNIST data set and see how the digits cluster in the first principal components.

#### YOUR WORK HERE
X <- t(t(x_train) - rowMeans(t(x_train)))
resSVD <- svd(X, nu=4, nv=4)
scores <- resSVD$u %*% diag(resSVD$d[1:4])
par(mfcol=c(1,2))

plot(scores[,1],scores[,2],
     col=unlist(colorsDigit[as.character(y_train)]), 
     pch=16,main='PCA representation',
     xlab=paste("dim1 = ",100*round(resSVD$d[1]^2/sum(resSVD$d^2),3),"%",sep = ""),
     ylab=paste("dim2 = ",100*round(resSVD$d[2]^2/sum(resSVD$d^2),3),"%",sep = ""))
legend("bottomright",legend=names(colorsDigit), pch=16, cex=0.8, col=unlist(colorsDigit))

plot(scores[,3],scores[,4],
     col=unlist(colorsDigit[as.character(y_train)]), 
     pch=16,main='PCA representation',
     xlab=paste("dim3 = ",100*round(resSVD$d[3]^2/sum(resSVD$d^2),3),"%",sep = ""),
     ylab=paste("dim4 = ",100*round(resSVD$d[4]^2/sum(resSVD$d^2),3),"%",sep = ""))
legend("bottomright",legend=names(colorsDigit), pch=16, cex=0.8, col=unlist(colorsDigit))
PCA MNIST digits

3.3 Build your “traditional” model

INCLASS WORK First build a “traditional” fully connected neural network, compile it and train it on the MNIST data set.

#### YOUR WORK HERE
#### define the architecture
model <- keras::keras_model_sequential()
model %>%
    layer_dense(units=256, activation='relu', input_shape=c(784)) %>%
    layer_dropout(rate=0.4) %>%
    layer_dense(units=128, activation='relu') %>%
    layer_dropout(rate=0.3) %>%
    layer_dense(units=10, activation='softmax')
summary(model)
## Model: "sequential_3"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## dense_10 (Dense)                    (None, 256)                     200960      
## ________________________________________________________________________________
## dropout_4 (Dropout)                 (None, 256)                     0           
## ________________________________________________________________________________
## dense_9 (Dense)                     (None, 128)                     32896       
## ________________________________________________________________________________
## dropout_3 (Dropout)                 (None, 128)                     0           
## ________________________________________________________________________________
## dense_8 (Dense)                     (None, 10)                      1290        
## ================================================================================
## Total params: 235,146
## Trainable params: 235,146
## Non-trainable params: 0
## ________________________________________________________________________________
#### compile the model - specify the loss, the metric and the optimizer
model %>% compile(
    loss      = 'categorical_crossentropy',
    optimizer = keras::optimizer_rmsprop(),
    metrics   = c('accuracy')
)
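
The training call itself is not shown in this notebook; a minimal sketch of it (the number of epochs and the batch size below are illustrative assumptions, not prescribed values) could be:

#### train the model, keeping the history for the learning curves
history <- model %>% fit(
    x = x_train,
    y = y_train_cat,
    epochs = 30,            # assumed value
    batch_size = 128,       # assumed value
    validation_split = 0.2
)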

A very nice way to visualize the evolution of the network accuracy is through learning curves, which may take multiple forms. In particular, one type of learning curve plots the score of your model as it moves through the epochs. You can produce this plot by simply applying the plot function to the history object produced by the fit function in the previous chunk.

plot(history)
## `geom_smooth()` using formula 'y ~ x'
Learning curves

Assess the performance of your model using the evaluate and predict_classes functions.

#### get loss and accuracy on the test set
eval_test <- model %>% evaluate(
    x       = x_test,
    y       = y_test_cat,
    verbose = 0
)
print(paste("Loss on test set:", round(eval_test["loss"], 5), sep=" "))
## [1] "Loss on test set: 0.10843"
print(paste("Accuracy on test set:", round(eval_test["accuracy"], 5), sep=" "))
## [1] "Accuracy on test set: 0.9815"

Confusion matrices

predictedTrain <- predict_classes(model, x=as.matrix(x_train))
predictedTest <- predict_classes(model, x=as.matrix(x_test)) 

confMatTrain <- getConfusionMatrix(as.character(predictedTrain), as.character(y_train))
confMatTest <- getConfusionMatrix(as.character(predictedTest), as.character(y_test))

ktrain <- kable(confMatTrain, caption="Train")
ktest <- kable(confMatTest, caption="Test")
kables(list(ktrain, ktest), format="html", caption="Confusion matrices")
Confusion matrices

Train
        0     1     2     3     4     5     6     7     8     9
0    5912     0     4     3     1     2     5     1     6     4
1       0  6727     4     2     7     0     2     9    13     5
2       3     4  5923    17     0     4     1     8     6     0
3       0     1     7  6086     0    15     0     4     6     8
4       1     3     2     0  5821     2     2     6     1    22
5       1     0     1    10     0  5372     3     2     6     6
6       1     2     3     0     2    13  5903     0     3     0
7       0     3     4     2     1     3     0  6228     3    17
8       2     2    10     5     0     5     2     1  5801     6
9       3     0     0     6    10     5     0     6     6  5881

Test
        0     1     2     3     4     5     6     7     8     9
0     974     0     2     0     1     2     6     1     3     2
1       1  1129     2     0     1     0     2     7     3     4
2       0     0  1014     6     1     0     0    10     4     0
3       1     2     1   995     0     9     1     1     3     5
4       0     1     1     0   964     1     3     0     4    12
5       1     0     0     3     0   869     3     0     3     3
6       1     1     1     0     5     5   942     0     3     0
7       1     1     6     3     0     0     0  1003     2     2
8       1     1     5     2     2     4     1     2   944     0
9       0     0     0     1     8     2     0     4     5   981

3.4 Build your CNN model

mnist   <- keras::dataset_mnist()
x_train <- mnist$train$x          # 3d array (60000,28,28)
y_train <- mnist$train$y          # 1d array (60000)

x_test  <- mnist$test$x           # 3d array (10000,28,28)
y_test  <- mnist$test$y           # 1d array (10000)

x_train <- x_train/255
x_test  <- x_test/255

x_train <- array_reshape(x_train, c(nrow(x_train), 28, 28, 1)) # 4d array (60000, 28, 28, 1)
x_test <- array_reshape(x_test, c(nrow(x_test), 28, 28, 1)) # 4d array (10000, 28, 28, 1)

y_train_cat <- keras::to_categorical(y_train, 10)
y_test_cat  <- keras::to_categorical(y_test, 10)

INCLASS WORK Build the convolutional neural network displayed in the image above, compile it and train it on the MNIST data set. Has your performance improved?

#### YOUR WORK HERE
model <- keras_model_sequential()
model %>%
    layer_conv_2d(kernel_size=c(5,5), filters=4, padding="valid", activation='relu', input_shape=c(28,28,1)) %>%
    layer_max_pooling_2d(pool_size=c(2,2)) %>%
    layer_conv_2d(kernel_size=c(5,5), filters=10, padding="valid", activation='relu') %>%
    layer_max_pooling_2d(pool_size=c(2,2)) %>%
    layer_flatten() %>%
    layer_dense(units=64, activation='relu') %>%
    layer_dense(units=10, activation='softmax')
summary(model)
## Model: "sequential_4"
## ________________________________________________________________________________
## Layer (type)                        Output Shape                    Param #     
## ================================================================================
## conv2d_1 (Conv2D)                   (None, 24, 24, 4)               104         
## ________________________________________________________________________________
## max_pooling2d_1 (MaxPooling2D)      (None, 12, 12, 4)               0           
## ________________________________________________________________________________
## conv2d (Conv2D)                     (None, 8, 8, 10)                1010        
## ________________________________________________________________________________
## max_pooling2d (MaxPooling2D)        (None, 4, 4, 10)                0           
## ________________________________________________________________________________
## flatten (Flatten)                   (None, 160)                     0           
## ________________________________________________________________________________
## dense_12 (Dense)                    (None, 64)                      10304       
## ________________________________________________________________________________
## dense_11 (Dense)                    (None, 10)                      650         
## ================================================================================
## Total params: 12,068
## Trainable params: 12,068
## Non-trainable params: 0
## ________________________________________________________________________________
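
As a check on the parameter counts above: the first convolution learns 4 filters of size 5×5×1 plus 4 biases, i.e. 5×5×1×4 + 4 = 104 parameters, and the second learns 10 filters of size 5×5×4 plus 10 biases, i.e. 5×5×4×10 + 10 = 1010. This parameter sharing is why the whole CNN has far fewer parameters (12,068) than the dense networks above.
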
model %>% compile(
    loss      = 'categorical_crossentropy',
    optimizer = 'adam',
    metrics   = c('accuracy')
)
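
As before, the training call is not shown; a minimal sketch (epoch count and batch size are illustrative assumptions) could be:

history <- model %>% fit(
    x = x_train,
    y = y_train_cat,
    epochs = 10,            # assumed value
    batch_size = 128,       # assumed value
    validation_split = 0.2
)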
plot(history)
## `geom_smooth()` using formula 'y ~ x'
Learning curves

Confusion matrices

predictedTrain <- predict_classes(model, x=x_train)
predictedTest <- predict_classes(model, x=x_test)

confMatTrain <- getConfusionMatrix(as.character(predictedTrain), as.character(y_train))
confMatTest <- getConfusionMatrix(as.character(predictedTest), as.character(y_test))

ktrain <- kable(confMatTrain, caption="Train")
ktest <- kable(confMatTest, caption="Test")
kables(list(ktrain, ktest), format="html", caption="Confusion matrices")
Confusion matrices

Train
        0     1     2     3     4     5     6     7     8     9
0    5898     4    14     5     4    17    28     1    18    13
1       0  6685     3     0     5     0     1    17    17     5
2       7    22  5898    25     5     5     5    36    29     2
3       2    10    19  6052     1    39     2    15    44    27
4       2     2     6     0  5790     4     7     9    13    37
5       4     0     0    16     0  5311    11     0    23    12
6       2     5     0     1     9    20  5857     0     8     0
7       3     8    11     7     3     2     0  6158     9    33
8       1     5     6    18     5    13     7     3  5664    12
9       4     1     1     7    20    10     0    26    26  5808

Test
        0     1     2     3     4     5     6     7     8     9
0     975     0     1     1     1     4     9     1     7     5
1       0  1126     1     0     1     0     2     3     0     5
2       1     4  1020     2     2     0     0    12     3     2
3       0     1     4   999     0    15     0     5     8     6
4       0     0     1     0   971     0     6     0     0    11
5       1     1     0     4     0   866     5     0     1     2
6       2     2     0     0     1     2   935     0     1     0
7       1     1     4     1     1     1     0  1003     3     8
8       0     0     0     2     0     2     1     0   949     1
9       0     0     1     1     5     2     0     4     2   969